Genetic and antiretroviral drug resistance mutations analysis of reverse transcriptase and protease gene from Pakistani people living with HIV-1

Background Antiretroviral therapy (ART) effectiveness is compromised by the emergence of HIV drug resistance mutations (DRM) and can lead to the failure of ART. Apart from intrinsic viral factors, non-compliance with drugs and/or the use of sub-optimum therapy can lead to the emergence of DRMs. In Pakistan HIV currently exists as a concentrated epidemic, however, ART coverage is very low, and drug adherence is poor. ART is selected assuming without baseline genotyping. Pakistan has recently seen a rise in treatment failures, but the country’s actual burden of DRM is still unknown. In this study, we perform the genetic and drug resistance analysis of the pol gene from Pakistani HIV-positive ART-naïve and ART-experienced individuals. Methods In this study, HIV-1 pol was sequenced from 146 HIV-1 positive individuals, divided into ART-naïve (n = 37) and ART-experienced (n = 109). The sequences were also used to determine HIV-1 subtypes, the prevalence of DRM, and pol genetic variability. Results DRM analysis identified numerous DRMs against reverse transcriptase inhibitors in both ART-naïve and ART-experienced groups, including a few that are classified as rare. Additionally, the ART-experienced group showed mutations associated with resistance to protease inhibitors. Genetic analysis showed negative selection pressure in both groups, but a higher rate of evolution in the ART-naïve group. Conclusion High prevalence of DRMs, especially against previous first-line treatment in ART- naïve and the accumulation of DRMs in ART-experienced groups is concerning and warrants that a more extensive DRM survey be carried out to inform first-line and second-line ART regimen recommendations.

Introduction Antiretroviral therapy (ART) [1] has successfully increased the survival and quality of life of people living with HIV (PLHIV) globally. However, the effectiveness of ART is greatly compromised by the emergence of drug resistance mutations (DRM) in the HIV genome [2,3], leading to the failure of ART therapy [4]. The major viral factor contributing to HIV drug resistance is its error-prone reverse transcriptase activity that facilitates the occurrence of a large number of mutations [5]. Other viral factors contributing to the emergence of DRM include recombination between viral strains, subtype-specific polymorphisms, and the emergence of minor variants [6]. Additionally, non-viral factors such as non-adherence to therapy, use of sub-optimum regimens, and use of ineffective drugs (due to pre-existing DRMs in treatment naïve patients) also contribute to the emergence of DRM [6]. The latter warrants that baseline and periodic genotyping of patients be carried out to monitor the emergence of drug resistance and choose the most effective drug combination [7].
According to the World Health Organization (WHO), transmitted HIV drug resistance mutations are those that are observed in patients who are unexposed to ART, i.e. in ART-naïve patients [6]. Certain transmitted drug resistance mutations (TDRM) revert to wild type in absence of drug selection pressure, for example, T215Y/F. The occurrence of TDRMs in ARTnaïve patients is a major concern as it can limit treatment options and efficacy and can also lead to virological failure [8,9]. The prevalence of acquired and transmitted (primary) resistance varies geographically [10]. According to a recent report by WHO, drug resistance has greatly increased against efavirenz (EFV) and nevirapine (NVP), which were traditionally part of ART in 12 countries in Africa, Asia, and the Americas [11]. According to a WHO report (2017), the most common drug resistance mutations against reverse transcriptase inhibitors in high-prevalence countries, such as Cameroon, Namibia, Uganda, Zimbabwe, Argentina, Brazil, Colombia, Guatemala, Mexico, Nicaragua, and Myanmar were M41L, M184V/I, T215any, K103N/S, Y181 and G190any, among ART-naïve and M184V/I, K65R, K103N/S, Y181any and G190any among ART-experienced [6]. The protease inhibitor resistance mutations were extremely rare, with a frequency of 1.8% in ART-experienced and 0.2% in ART-naïve [6]. Conversely, in Kenya, 22% (14/22) of minor protease drug resistance mutation (L33I) was observed in naïve individuals [10]. Moreover, in Africa, Mexico, Central, and South America, and South-East Asia, the prevalence of DRMs associated with resistance to non-nucleoside reverse transcriptase inhibitor were found to be <10% only in ART-naïve individuals, while, it was considerably higher (21.6%) among ART-experienced [6].
In Pakistan, HIV exists as a concentrated epidemic, with high prevalence in certain highrisk groups, such as people who inject drugs (38.4%), followed by male sex workers (5.6%), female sex workers (2.2%), transgender sex workers (7.5%), men who have sex with men (5.4%) and transgender people (7.1%) [12,13]. The number of PLHIV has increased drastically since the beginning of the epidemic to 0.21 million in 2022, involving numerous HIV outbreaks such as the unprecedented 2019 HIV-1 outbreak in children that occurred in the Larkana district of Sindh province [14][15][16].
Currently, in Pakistan, the first-line regimen is a backbone of tenofovir (TDF) and lamivudine (3TC) with dolutegravir (DTG) though both zidovudine (AZT) and abacavir (ABC) may be used in special circumstances [17]. ART is prescribed in Pakistan without performing baseline or periodic sequencing, increasing the possibility that cases may be on ineffective therapy. The data on HIV DRMs in Pakistan is very limited; only two studies had reported a high prevalence of drug resistance mutations in ART-naïve and ART-experienced Pakistani PLHIV [18,19]. No data exist on DRMs in Pakistani PLHIV with virological failure.
The current study presents genetic and drug resistance analysis of samples collected from ART-naïve and ART-experienced (including virological failure) Pakistani PLHIV before the rollout of DTG-based regimen. It is important to mention that the analysis of DRMs associated with reverse transcriptase (RTIs) and protease inhibitors (PIs) continues to hold significance in PLHIV, for example detecting instances of drug resistance, in individuals that may not qualify for a switch or initiation on DTG due to reasons such as diabetes/hyperglycemia [20][21][22].

Study design and PLHIV profile
This was a cross-sectional study, where approximately 159 blood samples were collected from Pakistani PLHIV based on a convenient sampling strategy, involving all PLHIV visiting the treatment centers during 2019-20 (study period). The samples were collected from Bridge Consultants Foundation, Karachi, Aga Khan University, Karachi, and Shaukat Khanum Memorial Hospital, Lahore. Written informed consent was obtained from all study participants before carrying out any study procedures. The study was approved by the Aga Khan University Ethical Review Committee (2019-1124-3195). In addition, a questionnaire was used to obtain demographics and relevant clinical information from the study participants' medical records, which was used to further characterize the participants based on exposure, viral load, CD4 count and history of drug, etc.
Out of 159 samples, only 146 were used, while rest with short/poor sequence reads (obtained after sequencing) were removed from this study. The study participants were divided into two categories: ART-naïve (n = 37), described as those who had no history of ART at the time of sample collection, and ART-experienced (n = 109) who were on ART. The most recent viral loads of ART-experienced individuals were taken from their record, while viral loads for ART-naïve were performed as part of another project, and this record was also included in the study.

DNA extraction and PCR amplification
Viral nucleic acid was extracted from all the PLHIV samples by using Qiagen's QIAamp DNA blood mini kit according to the manufacturer's instructions. The DNA was stored at -80 C till further use. The Thermo Scientific NanoDrop 2000/2000c spectrophotometer was used to determine the quality and quantity of the DNA. The samples with the concentration �20 ng/ ul and ratio of 1.8 at 260/280 nm absorbance were selected for PCR.
It is important to mention that only protease and reverse transcriptase regions were evaluated in these PLHIV as none of the participants were on integrase inhibitors at the time of the study.

Sequence alignment and HIV-1 genotyping
The obtained sequences were aligned using MEGA 6 software version 7.0. The alignment was cleaned to remove non-nucleotide characters. For HIV-1 subtyping, the pol sequences were processed through the REGA HIV online subtyping tool [25].

Analysis of drug resistance mutations and site-specific genomic variability
HIV-1 pol gene (protease and reverse transcriptase) was analyzed using the Stanford University HIV Drug Resistance Database HIVdb algorithm [26,27] and confirmed using the 2019 Update of the Drug Resistance Mutations in HIV-1 by the International AIDS Society-USA (IAS-USA) [28]. The DRMs were classified as high, intermediate, low, and potential low-level resistance, based on the information described in the Stanford University HIV Drug Resistance Database HIVdb algorithm and IAS-USA report [26,29]. The prevalence of transmitted drug resistance was analyzed by using the surveillance drug resistance mutations (SDRM) list recommended by the WHO in 2009, among the ART-naïve PLHIV in our study cohort [30].

Genomic variability, selection pressure, and rate variation on DRM sites
Genomic variability on DRM sites in pol sequences was analyzed in ART-naïve and ARTexperienced groups using the Shannon entropy analysis tool available at the Los Alamos National Laboratory HIV Sequence Database [31]. We estimated the probability of DRM sites under negative and positive selection pressure in ART-naïve and ART-experienced groups using the FUBAR method [32]. Finally, we used the TreeRate tool using the Los Alamos National Laboratory HIV Sequence Database [33,34], to calculate the estimated evolutionary rate in ART-naïve and ART-experience groups, using the generalized midpoint optimization method [33,34]. The statistical significance of the difference between the mean evolutionary rate of the two groups was calculated using an unpaired T-test using GraphPad Prism tool, where p<0.05 was considered significant.

PLHIV profile
The demographic information collected from participants revealed that individuals belonged to different ethnic groups present in region of sample collection, including Punjabi (n = 42), Pathan (n = 24), Baloch (n = 20), and Muhajir (n = 15), along with other minorities (Table 1). Among these 146 study subjects, 125 were males and 21 were females, aged between 13 and 76 years (mean 36) ( Table 1). Eighty-nine participants were married, while 40 were unmarried and one had divorced (Table 1). In the ART-naïve group, three participants reported that their partners were HIV positive, while 12 reported their partners to be HIV negative. Similarly, in the ART-experienced group, 28 participants reported that their partners were HIV positive, while 35 reported their partners to be HIV negative; the rest of the participants were not aware of the HIV status of their partners (Table 1). Based on high-risk behavior, 65 participants were identified as PWID, followed by 11 and 10 participants with a history of sexual contact with CWSW (cisgender women sex workers) and MSM (men who have sex with men), respectively (Table 1).
Comparative subtype analysis between the two groups (ART-naïve and ART-experienced) revealed subtype A1 to be the predominant subtype in both groups ( Fig 1B). In addition, CRF_02AG, CRF_35AD, and 02A1 were predominant in the ART-naïve group, while, subtypes C, G, D, and A1G were predominant in the ART-experienced group (Fig 1B). Interestingly, unassigned subtypes, along with CRF_56cpx, were only observed in ART-experienced PLHIV with virologic failure (Fig 1B).
Interestingly, certain mutations classified as rare/unusual/other mutations by the HIV Stanford drug resistance database (1) were found in high frequency in both the study groups ( Table 2). Highly prevalent mutations observed in the ART-naïve group were: V179I (54.2%),  Table 2. Classification of the drug resistance mutations. The table describes the frequency and classification of all DRMs in terms of the nature of the mutation, and their association with drug resistance as described in the Stanford University HIV Drug Resistance Database HIVdb algorithm (1) and the International Aids Society (IAS) 2019 report (38). In the mutation classification column, * indicates mutations that are classified as major and minor DRMs or others (no role identified) by the Stanford drug resistance database, while ˣ indicates mutations that are classified as major or minor DRMs by the International AIDS Society. In the drug associated with resistance column, * shows the association of a mutation with resistance to a particular drug, as indicated in the Stanford drug resistance database; while ˣ shows the association of a mutation with a particular drug, as indicated by the International AIDS Society (IAS) 2019 report, while bold drug names indicates the major resistance by IAS 2019. The abbreviations of the antiretroviral drugs are as follows: PI; atazanavir/ritonavir (ATV/r), darunavir/ ritonavir (DRV/r), fosamprenavir/ ritonavir (FPV/r), indinavir/ ritonavir (IDV/r), lopinavir/ ritonavir (LPV/r), nelfinavir (NFV), tipranavir/ ritonavir (TPV/r). NRTI; stavudine (d4T), didanosine (DDI), zidovudine (AZT), emtricitabine (FTC), lamivudine (3TC), abacavir (ABC), tenofovir (TDF). NNRTI; efavirenz (EFV), nevirapine (NVP), etravirine (ETR), rilpivirine (RPV), doravirine (DOR). The abbreviations used to classify the mutations are as follows: P = polymorphic, NP = non polymorphic, M = Major, m = Minor, U = unusual, HU = highly unusual, N = none, O = other, ER = extremely rare, TAM = thymidine analogue mutation, H = hypermutation, R = rare, CP = common polymorphic.
Estimates of the evolutionary rates between the ART-naïve and ART-experienced groups suggested that the mean evolutionary rate of sequences in the ART-naïve group was significantly higher (0.001206 ± 0.00008259) than the ART-experienced group (0.001047 ± 0.00002795), and the difference in the mean evolutionary rate was statistically significant (p = 0.0445; Fig 3).

Discussion
In this study, we performed the genetic and drug resistance mutation analysis of pol from a cohort of ART-naïve and ART-experience in Pakistani PLHIV.
The ethnic profile of our study participants suggested that most participants were Punjabis. This finding is consistent with previous reports which showed Punjabi and Sindhi to be the predominant ethnic groups among the PLHIV in Pakistan [35]. The participant profiles also suggested that the majority of PLHIV were males and PWIDs. This finding is also supported by previous studies showing that male PWIDs comprise the largest group in terms of HIV prevalence, followed by MSM and Hijra Sex workers (HSWs) [35]. We observed a large number of individuals with discordant HIV-1 status in both groups. This is not surprising as social stigmatization has forced many PLHIV in Pakistan to hide their HIV status from their partners [36]. In our cohort, the majority of our participants had viral loads greater than 1000 copies/ ml in both ART-naïve (35%) and ART-experienced (41%) groups, respectively (Table 1) and can be categorized as possible virological failures if the observed viral load was >1000 copies/ Table 3. Drug resistance mutations present in study participants. The table shows SDRMs in column 2, marked with an asterisk (*), and frequency of DRMs (columns 3 and 4), entropy score (column 5), and selection pressure (column 6) for each drug resistance mutations sites in both ART-naïve and ART-experienced groups. In column 5, the grey shaded column indicates high entropy sites, the entropy score which was greater than the mean value characterized as high, and the entropy score less than the mean value characterized as low entropy in ART-naïve (mean = 0.28) and ART-experienced (mean = 0.4) group, while '0' represents no entropy. In column 6,-values and + values indicate the probability of negative and positive selection pressure on DRM sites, respectively, whereas, the bold and underlined values show high selection pressure between ART-naïve and ART-experienced groups.

Drugs
Mutations ARTnaive (%) ART-experienced (%)    € € € ml in two consecutive viral load test three months apart with adherence support following the first viral load measurement, as per WHO criteria [9]. As the ART adherence rate is poor [37], especially in the PWID which forms a large proportion of PLHIV these individuals may have drug-resistance viral strains, leading to therapy failure [37]. This is consistence with our observations where ART-experienced patients with high viral loads were found to be infected by drug-resistance recombinant strains of HIV [23,38]. Analysis of subtype prevalence revealed a high prevalence of subtype A1 and CRF_02AG in our cohort (Fig 1A). This analysis is consistent with the previous studies where subtype A has consistently been shown to be the predominant subtype in Pakistan [39, 40], while CRF_02AG is also being observed in many Pakistani PLHIV [41,42]. We also observed three unassigned recombinant forms in our cohort, each of which has been observed for the first time in Pakistan, indicating that the Pakistani epidemic is more diverse than previously thought, in terms of subtype composition [43,44]. Many PWIDs and MSM often exhibit overlapping high-risk behaviors, which increases the possibility of multiple strain acquisition and recombination between strains, allowing the emergence of complex and unique recombinant forms [43][44][45][46].

Entropy
Comparative subtype analysis between the two groups showed a high prevalence of subtype C (14.7%) in the ART-experienced group, which had higher viral loads (Fig 1B). Previous reports have shown that PLHIV infected with subtype C had a higher rate of virological failure than patients infected with other subtypes [47,48].
Analysis of DRMs in the ART-naïve group showed a high prevalence DRMs conferring resistance against NNRTI and NRTIs. Among NNRTI-associated DRMs, E138A and K103N, and among NNRTI-associated DRMs K219E and K70R were highly prevalent (Table 3). Additionally, a major PI-associated DRM, I84V was observed only in one patient (Table 3). Similarly, DRM analysis of the ART-experienced group showed a high prevalence of DRMs against NNRTI, and NRTIs. Among these, DRMs E138A, K103N, G190A, V179T, M184V, K219E, K70R, and D67N associated with resistance against NNRTI or NRTI were highly prevalent (Table 3). Additionally, in two ART-experienced patients, with high viral loads, the coexistence of DRM M46I and I54V was observed; this combination can cause resistance against all PIs except darunavir/r [27,49]. Similarly, I47V and V82M/A/S were seen in one patient each, conferring resistance against atazanavir/r, lopinavir/r, and darunavir/r ( Table 2).
The DRM E138A has previously been observed as a predominant mutation among ARTexperienced Pakistani PLHIV [18]. Similarly, DRM Y115F has also been reported in Pakistani PLHIV [19]. The high prevalence of E138A has also been observed in ART-naïve PLHIV from Interestingly, we observed a high frequency of certain mutations classified as rare or unusual mutations by the HIV Stanford drug resistance database [27], in both ART-naïve and ART-experienced groups ( Table 2). Among these mutations, V179I and K238E were observed in the reverse transcriptase region, while K20I, K20R, V82I, and L10I/V were observed in the protease region in high prevalence. These DRMs against PI were classified as resistance-conferring mutations in the International AIDs society report on DRMs [28]. A study by Lambert et al. from France reported a high prevalence of V179I in non-B subtype sequences from ARTnaive individuals [56]. Similarly, mutation V179I was also found in rilpivirine-treated samples in vitro [57], where the mutation V179I alone does not reduce susceptibility against rilpivirine but it can increase the rilpivirine resistance in combination with other mutations such as Y181C or L100I+Y181C [57]. Moreover, a clinical trial revealed that V179I mutation was not associated with an increased risk of virological failure [58]. Further studies are needed to determine the effect of these point mutations on drug susceptibility, especially in our part of the world where treatment options are limited and newer-generation NNRTIs are not available.
In the next stage, we performed genetic variability analysis of the DRM sites. Shannon entropy analysis of the DRM sites identified sites 20,103,138,179,184,210,219,221,225,227,230, and 238 in the ART-naïve group and sites 20,103,138,179,184,190, 219, 221, 230 and 238 in ART-experienced group to have higher than mean entropy (Table 2). Since high Shannon entropy in different regions of the genome correlates with instability [66], the regions with high entropy may be susceptible to further mutations or changes (such as purification of mutations) [66]. This finding is supported by a study from Onsongo et al., where they found high entropy scores for DRM sites in pol sequenced from ART-naive individuals [10]. In the next step, we performed a selection pressure analysis of the DRM sites. Selection pressure is the ratio of synonymous (nucleotide mutations that do not change the amino acid translation) and non-synonymous (nucleotide mutations that change the amino acid translation) mutations called dN/dS [67]. Ratio <1 indicates negative selection pressure, indicating that sites are under constrain, while positive selection (dN/dS ratio is > 1), indicates that amino acid changes are favored i.e., they increase the organism's fitness and the conditions favor the organism to adapt [67]. In this study, we found DRM sites 82, 184, and 219 to be under positive selection pressure, while the rest of the sites were under negative selection pressure ( Table 2). This negative selection and high entropy of most DRM sites indicate that the sites are under genetic constraints either because the replicative fitness of the virus is affected or they are in a transient phase of evolution and will further evolve to a more favorable state [68]. Finally, evolutionary rate analysis revealed that ART-naïve individuals had high mean evolutionary rate than ART-experienced individuals. The higher rate might indicate that the virus in ART-naïve is in the acute phase of infection, and the absence of ART allows the rapid expansion of the viral populations [69]. Furthermore, since most ART-naïve individuals exhibited high DRM prevalence, a high evolution rate might be an effort on the part of the virus to revert to the more stable form, in the absence of ART [69]. This hypothesis can be supported by the high entropy values of most DRM sites in ART-naïve individuals.
We anticipate certain limitations of the study including the small sample size due to the convenience sampling strategy and amplification of only reverse transcriptase and protease genes as none of the participants were on integrase inhibitors. A study on larger samples (from different HIV treatment centers across the country) including data on integrase resistance could shed light on the effectiveness of ART in Pakistan and provide an even more clear picture of the HIV-1 epidemic and drug resistance in Pakistan.
In conclusion, the increasing rates of DRM variants in our cohort are concerning, which eventually can lead to ART failure, particularly the first-line drugs [29,70]. Further studies are needed for surveillance of drug resistance mutations in newly infected ART naïve patients in Pakistan to encounter the virus more efficiently. This study also warrants the need for comprehensive genetic analysis, surveillance, and control of drug-resistant variants that if become epidemic strains can severely impact HIV treatment modalities as well as control measures.